Add more partitions and update unit test #116
Thanks Yue, this is really helpful! The PR generally looks good, but the GPU can be tricky: in sbatch we need to explicitly request generic resources (gres) like GPUs with … I tried starting a Jupyter notebook with a GPU on gpu2 using the env_starter branch. I can start a job with 28 CPUs, but no GPU access is given. Maybe we are no longer allocated Midway GPUs? If that's the case, we don't need to implement my GPU gres comment and can merge the PR directly. |
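For reference, here is a minimal sketch of what an explicit GPU request in an sbatch header could look like. The helper name `sbatch_header` and its parameters are hypothetical and are not part of utilix; `--gres=gpu:N` is the standard Slurm form for requesting generic resources, but whether it is the exact option meant in the comment above is an assumption.

```python
def sbatch_header(partition, cpus, gpus=0):
    """Build sbatch header lines (hypothetical helper, not the utilix API)."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --partition={partition}",
        f"#SBATCH --cpus-per-task={cpus}",
    ]
    # Slurm only grants GPUs when they are requested as generic resources.
    if gpus > 0:
        lines.append(f"#SBATCH --gres=gpu:{gpus}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Assumed values: the gpu2 partition and 28 CPUs mentioned in the thread.
    print(sbatch_header("gpu2", 28, gpus=1))
```

Without the gres line the job may still start on a GPU partition, which would match the CPU-only behaviour described above.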
@shenyangshi So if I understand correctly, we haven't successfully used a GPU even on the gpu2 partition? |
I think we could originally access gpu2 (see the Slack message from Andrii), but now I'm not sure, as I haven't successfully used it. |
@shenyangshi I just tried with |
Sounds good, thanks |
@shenyangshi After diving into this, I realized that it's not trivial to set up the GPUs for
Therefore, the
|
Thanks Yue for the hard work and detailed check! I totally agree we can use it as a CPU-only node now. |
As @shenyangshi mentioned in #115, there are more possible partitions than what we previously listed in utilix.batchq, and the ones that are actually available, bigmem2 and gpu2, are added here (see Yue's response in #115).
Besides, the unit test for batchq has also been updated to ensure that all of the partitions listed here are actually supported. I have tested on midway2, midway3 and dali, and all tests passed.
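As an illustration of the kind of check described above, the following sketch compares a list of partition names against what Slurm reports on the current cluster. It is not the actual batchq unit test: the function names and the sample partition names are hypothetical, and it assumes `sinfo -h -o %P` is available, which prints one partition name per line and marks the default partition with a trailing `*`.

```python
import subprocess


def parse_sinfo_partitions(sinfo_output):
    """Extract partition names from the output of `sinfo -h -o %P`."""
    names = set()
    for line in sinfo_output.splitlines():
        # The default partition carries a trailing '*', which is stripped here.
        name = line.strip().rstrip("*")
        if name:
            names.add(name)
    return names


def cluster_partitions():
    """Query Slurm for the partitions on this cluster (needs sinfo on PATH)."""
    result = subprocess.run(
        ["sinfo", "-h", "-o", "%P"],
        capture_output=True, text=True, check=True,
    )
    return parse_sinfo_partitions(result.stdout)


def unsupported(listed, available):
    """Return the listed partitions that the cluster does not report."""
    return sorted(set(listed) - set(available))


if __name__ == "__main__":
    # Hypothetical list; the real one lives in utilix.batchq.
    listed = ["bigmem2", "gpu2"]
    print(unsupported(listed, cluster_partitions()))
```

Keeping the parsing separate from the `sinfo` call lets the logic be tested on machines without Slurm.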